Ignore ROCm-LLVM on aarch64 #223
Why would we do this? If it is not supported at all for Arm then why create modules?
The EESSI module will not expose a MODULEPATH that does not exist. What I would suggest is that we instead print a message in the EESSI module when we see aarch64 and an AMD GPU.
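To make this suggestion concrete, here is a minimal sketch of the check in Python. The EESSI module itself is an Lmod module, so this only illustrates the logic. The function name and the way GPU vendor IDs are obtained are assumptions, not existing EESSI code; `0x1002` is the PCI vendor ID used by AMD GPUs.

```python
import platform

# PCI vendor ID assigned to AMD/ATI devices.
AMD_PCI_VENDOR_ID = "0x1002"


def rocm_unsupported_message(machine, gpu_vendor_ids):
    """Return a user-facing message if an AMD GPU is seen on aarch64, else None.

    `machine` is a string such as platform.machine() returns;
    `gpu_vendor_ids` is an iterable of PCI vendor ID strings (hypothetical
    input: how these are detected is site-specific and not shown here).
    """
    if machine == "aarch64" and AMD_PCI_VENDOR_ID in gpu_vendor_ids:
        return "ROCm-LLVM 6.4.1 is not supported on the aarch64 family of CPUs."
    return None


if __name__ == "__main__":
    # Vendor IDs are hard-coded for illustration only.
    msg = rocm_unsupported_message(platform.machine(), ["0x1002"])
    if msg:
        print(msg)
```

With this approach no ROCm modules need to exist on aarch64; the user is simply told why none are available.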
What I mean is, this shouldn't be a warning but an error.
My assumption here is |
msg += "You can override this behaviour by setting the EESSI_OVERRIDE_ROCM_VERSION_CHECK environment variable."
print_warning(msg)
var = EESSI_IGNORE_AARCH64_ROCMLLVM641_ENVVAR
setattr(self, EESSI_UNSUPPORTED_MODULE_ATTR, UnsupportedModule(envvar=var, errmsg=errmsg))
We should define a sensible errmsg before this line. Note that this is the errmsg that gets printed to the end user trying to load the module. Thus, it should make sense to such an end-user...
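As an illustration of what an end-user-facing `errmsg` could look like and how it might end up in the generated module file, here is a hedged sketch. The `UnsupportedModule` definition below is a stand-in namedtuple, the `lua_guard` helper is hypothetical, and the environment variable value is assumed; only `os.getenv` (standard Lua) and `LmodError` (Lmod) are real functions in the emitted snippet.

```python
from collections import namedtuple

# Stand-in for the UnsupportedModule type used in the hook (assumed shape).
UnsupportedModule = namedtuple("UnsupportedModule", ["envvar", "errmsg"])


def lua_guard(unsupported):
    """Build a Lua snippet that aborts the module load unless envvar is set.

    Hypothetical helper: it only shows how envvar and errmsg could be
    combined into a module-file footer.
    """
    # Escape backslashes and double quotes so the message is a valid Lua string.
    errmsg = unsupported.errmsg.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f'if (not os.getenv("{unsupported.envvar}")) then\n'
        f'    LmodError("{errmsg}")\n'
        "end\n"
    )


# Assumed value; the real constant is defined elsewhere in the hooks.
var = "EESSI_IGNORE_AARCH64_ROCMLLVM641"
errmsg = (
    "ROCm-LLVM 6.4.1 is not supported on the aarch64 family of CPUs. "
    f"You can override this behaviour by setting the {var} environment variable."
)

if __name__ == "__main__":
    print(lua_guard(UnsupportedModule(envvar=var, errmsg=errmsg)))
```

The message names both the reason and the override, so a user who hits the error at `module load` time knows what happened and what to do next.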
It is.
You get a warning at install time that this will trigger a
This is another option. I'm ok with this as well. It depends a bit on your philosophy: if we want all module environments to look identical in terms of which modules are present, we should take the One other reason I'd have for opting for the |
This is not the same as the zen4 case: that was a CPU toolchain that didn't work on that CPU. In this case, this is an accelerator that will never be matched with such a CPU. Just like A64FX+CUDA, there is no reason for those modules to exist, as they will never work.
Think of it in terms of what usable modules are in the |
ROCm-LLVM 6.4.1 is not supported on the aarch64 family of CPUs.
See: EESSI/software-layer#1473 (comment)